Skip to content

feat(trust): gateway echo-on-deny (Phase 2) + trust-none default (Phase 3)#1273

Open
chaodu-agent wants to merge 2 commits into
mainfrom
feat/trust-gateway-echo-denyflip
Open

feat(trust): gateway echo-on-deny (Phase 2) + trust-none default (Phase 3)#1273
chaodu-agent wants to merge 2 commits into
mainfrom
feat/trust-gateway-echo-denyflip

Conversation

@chaodu-agent

Copy link
Copy Markdown
Collaborator

Enables trust-none for the gateway path (Telegram/LINE/…) with the request-access echo, so you can see the deny + echo UX end-to-end. Scoped intent: only pre-beta (self-use) runs this; emilie is unaffected (its own image + WS path, which doesn't use this registry/echo).

Changes

  • Phase 2 — echo on deny: when the gate returns DenyIdentity, reply to the sender with their ID + how to request access (adapter.send_message). Throttled to 1 echo per (platform, sender) / 5 min (LazyLock map) to prevent amplification. DenyScope stays silent (scope isn't a security boundary).
  • Phase 3 (gateway) — trust-none default: flip GATEWAY_ALLOW_ALL_USERS default true → false. Gateway L3 is now deny-all unless GATEWAY_ALLOWED_USERS lists the sender (or GATEWAY_ALLOW_ALL_USERS=true). L2 (channels) stays open.

⚠️ Behavior change (intended)

Gateway deployments on this image now deny unknown users by default. To keep a bot working:

  • set GATEWAY_ALLOWED_USERS=<uid,…>, or
  • set GATEWAY_ALLOW_ALL_USERS=true (old behavior).

Discord/Slack registry entries are unchanged (still behavior-preserving allow-all). Deviates from the strict #1269 phase order (deny-flip before privatization) — accepted because it ships only to pre-beta (self-use) for validation.

How to test (Telegram)

  1. On the pre-beta Telegram bot, set GATEWAY_ALLOWED_USERS=<your uid> → you're admitted.
  2. Message from a non-listed account → no agent action + you get: ⚠️ You are not on this bot's trusted list. Your ID: <id> … (once per 5 min).

Testing

  • clippy --features unified -D warnings clean; 10/10 gateway tests (incl. new echo_allowed_throttles_repeat_within_window).

Refs #1264 #1269

- Phase 2: on Decision::DenyIdentity, echo the sender their ID via
  adapter.send_message so they can request access (request-access UX).
  Throttled to 1 echo per (platform,sender) per 5min (LazyLock map) to
  prevent amplification. DenyScope stays silent (not a security boundary).
- Phase 3 (gateway): flip GATEWAY_ALLOW_ALL_USERS default true→false, so
  gateway L3 is trust-none by default. L2 (channels) stays open. Admit via
  GATEWAY_ALLOWED_USERS or GATEWAY_ALLOW_ALL_USERS=true.

Ships in pre-beta (self-use). emilie unaffected (own image + WS path, which
doesn't use this registry/echo). Discord/Slack registry entries unchanged
(still behavior-preserving allow-all).

Refs #1264 #1269
@chaodu-agent chaodu-agent requested a review from thepagent as a code owner July 1, 2026 12:35
@chaodu-agent

Copy link
Copy Markdown
Collaborator Author

LGTM ✅ — Phase 2 echo-on-deny and Phase 3 trust-none default are well-implemented, correctly scoped, and ship with appropriate safeguards.

What This PR Does

Implements the gateway trust echo for denied identity requests (Phase 2) and flips the GATEWAY_ALLOW_ALL_USERS default from true to false (Phase 3), enabling a deny-all-by-default posture for gateway-pathed platforms.

How It Works

  • Replaces the simple if !decision.is_allowed() block with a match on the three Decision variants:
    • Allow → pass through (noop)
    • DenyIdentity → send a throttled echo showing the user their ID + how to request access, then drop
    • DenyScope / wildcard → silent drop (no information leakage)
  • Throttle state: global LazyLock<Mutex<HashMap<String, Instant>>> with a 5-minute window per (platform, sender) key prevents amplification abuse.
  • Default flip in main.rs is a one-line change with clear comments documenting the migration path (GATEWAY_ALLOW_ALL_USERS=true or list GATEWAY_ALLOWED_USERS).

Findings

# Severity Finding Location
1 🟢 Throttle prevents amplification — good anti-abuse design gateway.rs:1144-1162
2 🟢 DenyScope is correctly silent — no scope config leakage gateway.rs:1228-1239
3 🟢 Clear migration path documented in PR description and code comments main.rs:280-283
4 🟢 Unit test covers the throttle edge case without global state pollution gateway.rs:1476-1482
What's Good (🟢)
  • Anti-abuse throttle: The 5-minute per-(platform, sender) rate limit is the right choice. It lets a legitimate user see their ID once, while preventing an attacker from using the echo as an amplification vector.
  • Silent DenyScope: Using a wildcard arm (_) for DenyScope and future variants means new deny reasons default to silent — correct security posture.
  • Phased rollout: Only gateway-pathed platforms (Telegram, LINE, etc.) are affected. Discord/Slack registry entries are explicitly unchanged. Pre-beta scoping limits blast radius.
  • Self-documenting flip: The Phase 3 comment in main.rs links context (was true in feat(trust): Phase 1 (gateway) — route gateway ingress through the shared gate #1267, now false) so future readers understand the history.
Baseline Check
  • PR opened: 2026-07-01
  • Main already has: Decision enum with Allow/DenyScope/DenyIdentity, gate_incoming call in gateway, simple deny block with TODO comment for Phase 2 echo, GATEWAY_ALLOW_ALL_USERS defaulting to true
  • Net-new value: actual echo implementation with throttle (Phase 2), default flip to deny-all (Phase 3)
Note: Unbounded throttle map (non-blocking)

The ECHO_THROTTLE HashMap grows without eviction — stale entries for senders who never return are never removed. For pre-beta with a handful of users this is fine. If this path ever handles high-cardinality traffic, consider periodic eviction or a bounded LRU cache. Not blocking since this is scoped to pre-beta validation.

@pahud pahud left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review

Code correctness:

  • Decision match in process_gateway_event is exhaustive (Allow / DenyIdentity / DenyScope via _) — matches the enum in trust.rs. Good use of _ to also catch any future variant defensively.
  • Echo throttle (ECHO_THROTTLE + echo_allowed) is a simple per-(platform, sender) 5-min window backed by a Mutex<HashMap> — correct and adequately tested (echo_allowed_throttles_repeat_within_window). Fine for pre-beta self-use scale; would want a TTL/eviction sweep if this ever sees real traffic (unbounded map growth), but not a blocker at current scope.
  • DenyScope staying silent (no echo) is correctly scoped per the L2/L3 trust model from the ADR.
  • The GATEWAY_ALLOW_ALL_USERS default flip (true → false) is a single, well-isolated line change with a clear comment explaining the behavior change.
  • send_message errors are intentionally swallowed (let _ = ...) on the echo path — reasonable, since a failed echo shouldn't fail event processing, though consider at least a tracing::warn! on error for observability.

Scope/safety:

  • Correctly scoped to pre-beta only per the PR description (emilie unaffected).
  • Discord/Slack registries unchanged — consistent with the stated deviation from strict phase ordering.

Gap found: docs/config-reference.md still documents GATEWAY_ALLOW_ALL_USERS default as true. This PR changes the code default to false but doesn't update that doc table — it'll be actively wrong once merged. Not a blocker, but should be fixed in this PR or immediately after.

CI: all checks green (33/33 smoke tests, clippy, unit tests).

Approving — please push a doc fix for the GATEWAY_ALLOW_ALL_USERS default in docs/config-reference.md before or right after merge.

@thepagent thepagent left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved as code owner. See detailed review from @pahud above — code is correct (exhaustive Decision match, tested throttle, well-scoped default flip). One follow-up needed: docs/config-reference.md still lists GATEWAY_ALLOW_ALL_USERS default as true; should be updated to false to match this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants